104 research outputs found

    Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction

    Discovering relevant, but possibly hidden, variables is a key step in constructing useful and predictive theories about the natural world. This brief note explains the connections between three approaches to this problem: the recently introduced information-bottleneck method, the computational mechanics approach to inferring optimal models, and Salmon's statistical relevance basis. Comment: 3 pages, no figures, submitted to PRE as a "brief report". Revision: added an acknowledgements section originally omitted by a LaTeX bug.
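    As a concrete anchor for the information-bottleneck method discussed here, the sketch below implements the standard self-consistent IB updates for a discrete joint distribution p(x, y). It is a generic illustration of the method, not code from this note; the function signature and the numerical-stability constant are assumptions.

```python
import numpy as np

def information_bottleneck(p_xy, n_clusters, beta, n_iter=200, seed=0):
    """Iterative IB sketch: compress X into T while preserving
    information about Y, by alternating the self-consistent equations
      q(t|x) ~ q(t) * exp(-beta * KL(p(y|x) || q(y|t))).
    Illustrative only; assumes a discrete joint distribution p_xy."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / p_x[:, None]            # rows hold p(y|x)
    # start from a random soft assignment q(t|x)
    q_t_given_x = rng.dirichlet(np.ones(n_clusters), size=p_xy.shape[0])
    for _ in range(n_iter):
        q_t = q_t_given_x.T @ p_x                # cluster marginal q(t)
        # q(y|t) via Bayes' rule: sum_x p(y|x) p(x) q(t|x) / q(t)
        q_y_given_t = (q_t_given_x * p_x[:, None]).T @ p_y_given_x
        q_y_given_t /= q_y_given_t.sum(axis=1, keepdims=True)
        # KL(p(y|x) || q(y|t)) for every (x, t) pair
        log_ratio = (np.log(p_y_given_x[:, None, :] + 1e-12)
                     - np.log(q_y_given_t[None, :, :] + 1e-12))
        kl = (p_y_given_x[:, None, :] * log_ratio).sum(axis=2)
        # Boltzmann-like reassignment, then renormalize over t
        q_t_given_x = q_t[None, :] * np.exp(-beta * kl)
        q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)
    return q_t_given_x
```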

    Beyond Word N-Grams

    We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mixtures have provably and practically better performance than almost any single model. We evaluate the model on several corpora. The low perplexity achieved by relatively small PST mixture models suggests that they may be an advantageous alternative, both theoretically and practically, to the widely used n-gram models. Comment: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty. Revised version of a paper in the Proceedings of the Third Workshop on Very Large Corpora, MIT, 1995.
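    To make the PST idea concrete, here is a minimal single-tree sketch with longest-suffix backoff and add-alpha smoothing. It deliberately omits the paper's Bayesian mixture over trees and its unbounded-vocabulary data structures; the class name and smoothing scheme are illustrative assumptions.

```python
from collections import defaultdict

class PST:
    """Minimal word-level prediction suffix tree: contexts up to
    max_depth are stored as tuples, and prediction backs off to the
    longest context seen in training. A sketch, not the paper's model."""

    def __init__(self, max_depth=3, alpha=0.5):
        self.max_depth = max_depth
        self.alpha = alpha                       # add-alpha smoothing
        self.counts = defaultdict(lambda: defaultdict(float))
        self.vocab = set()

    def train(self, words):
        for i, w in enumerate(words):
            self.vocab.add(w)
            for d in range(self.max_depth + 1):
                if i - d < 0:
                    break
                self.counts[tuple(words[i - d:i])][w] += 1

    def prob(self, context, word):
        # back off to the longest stored suffix of the context
        ctx = tuple(context[-self.max_depth:])
        while ctx and ctx not in self.counts:
            ctx = ctx[1:]
        c = self.counts[ctx]
        total = sum(c.values())
        return (c[word] + self.alpha) / (total + self.alpha * len(self.vocab))

model = PST()
model.train("the cat sat on the mat the cat ran".split())
print(model.prob(["the"], "cat"))                # ~0.42 on this toy corpus
```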

    Objective Classification of Galaxy Spectra using the Information Bottleneck Method

    A new method for classification of galaxy spectra is presented, based on a recently introduced information-theoretic principle, the 'Information Bottleneck'. For any desired number of classes, galaxies are classified such that the information content about the spectra is maximally preserved. The result is classes of galaxies with similar spectra, where the similarity is determined via a measure of information. We apply our method to approximately 6000 galaxy spectra from the ongoing 2dF redshift survey, and to a mock-2dF catalogue produced by a Cold Dark Matter-based semi-analytic model of galaxy formation. We find a good match between the mean spectra of the classes found in the data and in the models. For the mock catalogue, we find that the classes produced by our algorithm form an intuitively sensible sequence in terms of physical properties such as colour, star formation activity, morphology, and internal velocity dispersion. We also show the correlation of the classes with the projections resulting from a Principal Component Analysis. Comment: submitted to MNRAS, 17 pages, LaTeX, with 14 figures embedded.
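    The agglomerative variant of the information bottleneck is a natural fit for fixed-class-count clustering of this kind; the sketch below merges the pair of classes with the smallest information loss (weighted Jensen-Shannon divergence), treating each normalized spectrum as a distribution over wavelength bins. An illustrative reconstruction, not the authors' pipeline.

```python
import numpy as np

def js_loss(p, q, wp, wq):
    """Information loss from merging two classes: weighted JSD."""
    w1, w2 = wp / (wp + wq), wq / (wp + wq)
    m = w1 * p + w2 * q
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return (wp + wq) * (w1 * kl(p, m) + w2 * kl(q, m))

def agglomerative_ib(spectra, n_classes):
    """Greedy agglomerative IB sketch: rows of `spectra` are spectra,
    normalized here into distributions over wavelength bins. Classes
    are merged pairwise, always taking the cheapest merge, until
    n_classes remain. Returns the member indices of each class."""
    p = [row / row.sum() for row in np.asarray(spectra, dtype=float)]
    w = [1.0 / len(p)] * len(p)                  # uniform class weights
    members = [[i] for i in range(len(p))]
    while len(p) > n_classes:
        _, i, j = min(((js_loss(p[i], p[j], w[i], w[j]), i, j)
                       for i in range(len(p)) for j in range(i + 1, len(p))),
                      key=lambda t: t[0])
        wi, wj = w[i] / (w[i] + w[j]), w[j] / (w[i] + w[j])
        p[i] = wi * p[i] + wj * p[j]             # merged class distribution
        w[i] += w[j]
        members[i] += members[j]
        del p[j], w[j], members[j]
    return members
```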

    Using state space differential geometry for nonlinear blind source separation

    Given a time series of multicomponent measurements of an evolving stimulus, nonlinear blind source separation (BSS) seeks to find a "source" time series, composed of statistically independent combinations of the measured components. In this paper, we seek a source time series with local velocity cross-correlations that vanish everywhere in stimulus state space. However, in an earlier paper the local velocity correlation matrix was shown to constitute a metric on state space. Therefore, nonlinear BSS maps onto a problem of differential geometry: given the metric observed in the measurement coordinate system, find another coordinate system in which the metric is diagonal everywhere. We show how to determine if the observed data are separable in this way, and, if they are, we show how to construct the required transformation to the source coordinate system, which is essentially unique except for an unknown rotation that can be found by applying the methods of linear BSS. Thus, the proposed technique solves nonlinear BSS in many situations or, at least, reduces it to linear BSS, without the use of probabilistic, parametric, or iterative procedures. This paper also describes a generalization of this methodology that performs nonlinear independent subspace separation. In every case, the resulting decomposition of the observed data is an intrinsic property of the stimulus' evolution, in the sense that it does not depend on the way the observer chooses to view it (e.g., the choice of the observing machine's sensors). In other words, the decomposition is a property of the evolution of the "real" stimulus that is "out there" broadcasting energy to the observer. The technique is illustrated with analytic and numerical examples. Comment: Contains 14 pages and 3 figures. For related papers, see http://www.geocities.com/dlevin2001/ . New version is identical to the original version except for the URL in the byline.
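    A rough numerical sketch of the quantity that drives this construction: the local velocity correlation matrix, estimated here on a coarse grid over state space. The binning scheme is an assumption for illustration, and the differential-geometric step of finding coordinates that diagonalize this metric everywhere is not shown.

```python
import numpy as np

def local_velocity_metric(x, n_bins=8):
    """Estimate the local second-order velocity statistics that serve
    as a metric on state space. `x` has shape (T, D): T samples of a
    D-component measurement. Returns a dict mapping occupied grid
    cells to local velocity covariance matrices. Sketch only."""
    v = np.gradient(x, axis=0)                   # velocities dx/dt
    lo, hi = x.min(axis=0), x.max(axis=0)
    cells = np.floor((x - lo) / (hi - lo + 1e-12) * n_bins).astype(int)
    keys = [tuple(row) for row in cells]
    metric = {}
    for key in set(keys):
        sel = np.array([k == key for k in keys])
        vv = v[sel]
        if len(vv) < 2:                          # need >1 sample per cell
            continue
        dv = vv - vv.mean(axis=0)
        metric[key] = dv.T @ dv / len(vv)        # local <dv dv^T>
    return metric
```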

    A Bivariate Measure of Redundant Information

    We define a measure of redundant information based on projections in the space of probability distributions. Redundant information between random variables is information that is shared between those variables, but, in contrast to mutual information, it denotes information that is shared about the outcome of a third variable. Formalizing this concept, and being able to measure it, is required for the non-negative decomposition of mutual information into redundant and synergistic information. Previous attempts to formalize redundant or synergistic information struggle to capture some desired properties. We introduce a new formalism for redundant information and prove that it satisfies all of the necessary properties outlined in earlier work, as well as an additional criterion that we propose to be necessary to capture redundancy. We also demonstrate the behaviour of this new measure for several examples, compare it to previous measures, and apply it to the decomposition of transfer entropy. Comment: 16 pages, 15 figures, 1 table, added citation to Griffith et al 2012, Maurer et al 1999.
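    For comparison, here is a short implementation of one of the earlier measures this line of work responds to, the Williams-Beer I_min redundancy for two sources; it is not the projective measure defined in the paper, and the array layout is an assumption.

```python
import numpy as np

def i_min(p):
    """Williams-Beer I_min redundancy for a joint array p[x1, x2, y]:
    the expected minimum (over sources) specific information about Y.
    A sketch of an earlier measure, for comparison with the paper's."""
    p = p / p.sum()
    p_y = p.sum(axis=(0, 1))
    redundancy = 0.0
    for y in range(p.shape[2]):
        if p_y[y] == 0:
            continue
        spec = []
        for axis in (1, 0):                      # marginalize out the other source
            p_xy = p.sum(axis=axis)              # joint of one source with Y
            p_x = p_xy.sum(axis=1)
            si = 0.0                             # specific information I(Y=y; X_i)
            for x in range(p_xy.shape[0]):
                if p_xy[x, y] == 0:
                    continue
                si += (p_xy[x, y] / p_y[y]) * (
                    np.log2(p_xy[x, y] / p_x[x]) - np.log2(p_y[y]))
            spec.append(si)
        redundancy += p_y[y] * min(spec)
    return redundancy

# redundant-copy example: X1 = X2 = Y (uniform bit) gives 1 bit
p = np.zeros((2, 2, 2))
p[0, 0, 0] = p[1, 1, 1] = 0.5
print(i_min(p))                                  # 1.0
```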

    Machine learning and the physical sciences

    Machine learning encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, and it has entered most scientific disciplines in recent years. We review in a selective way the recent research on the interface between machine learning and the physical sciences. This includes conceptual developments in machine learning (ML) motivated by physical insights, applications of machine learning techniques to several domains in physics, and cross-fertilization between the two fields. After giving a basic notion of machine learning methods and principles, we describe examples of how statistical physics is used to understand methods in ML. We then move on to describe applications of ML methods in particle physics and cosmology, quantum many-body physics, quantum computing, and chemical and material physics. We also highlight research and development into novel computing architectures aimed at accelerating ML. In each of the sections we describe recent successes as well as domain-specific methodology and challenges.

    Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

    Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation. Comment: 16 pages, 3 figures
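    A minimal sketch of the KL-constrained (free-energy) value iteration that such planners build on, without the model-uncertainty generalization this paper adds; the names and tensor layout are illustrative assumptions.

```python
import numpy as np

def free_energy_value_iteration(P, R, pi0, beta, gamma=0.95, n_iter=500):
    """KL-regularized value iteration sketch. Deviations from the
    reference policy pi0 are charged a KL cost, so the Bellman backup
    becomes a log-sum-exp (free energy) with inverse temperature beta.

    P: transitions P[s, a, s'], R: rewards R[s, a], pi0: pi0[s, a]."""
    V = np.zeros(P.shape[0])
    for _ in range(n_iter):
        Q = R + gamma * P @ V                    # Q[s, a] backup
        # free-energy backup: (1/beta) log sum_a pi0(a|s) exp(beta Q)
        V = np.log((pi0 * np.exp(beta * Q)).sum(axis=1)) / beta
    # the optimal policy is a Boltzmann tilt of the reference policy
    policy = pi0 * np.exp(beta * (R + gamma * P @ V))
    policy /= policy.sum(axis=1, keepdims=True)
    return V, policy
```

    As beta grows large this backup approaches standard value iteration, while as beta approaches zero the policy collapses onto the reference pi0, mirroring the limit cases named in the abstract.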

    Shannon Meets Carnot: Generalized Second Thermodynamic Law

    The classical thermodynamic laws fail to capture the behavior of systems whose energy Hamiltonian is an explicit function of the temperature. Such Hamiltonians arise, for example, in modeling information processing systems, like communication channels, as thermal systems. Here we generalize the second thermodynamic law to encompass systems with temperature-dependent energy levels, dQ = TdS + ⟨dE/dT⟩dT, where ⟨·⟩ denotes averaging over the Boltzmann distribution, and we reveal a new definition of the basic notion of temperature. This generalization enables one to express, for instance, the mutual information of the Gaussian channel as a consequence of the fundamental laws of nature - the laws of thermodynamics
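    For concreteness, the generalized law and its classical limit, restated from the abstract (the explicit Boltzmann-average notation is an assumption):

```latex
% Generalized second law for temperature-dependent energy levels E_i(T),
% with p_i the Boltzmann weights at temperature T:
\[
  dQ \;=\; T\,dS + \left\langle \frac{dE}{dT} \right\rangle dT,
  \qquad
  \left\langle \frac{dE}{dT} \right\rangle \equiv \sum_i p_i \frac{dE_i}{dT}.
\]
% When the levels E_i do not depend on T, the extra term vanishes and
% the classical relation dQ = T dS is recovered.
```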

    Causal blankets: Theory and algorithmic framework

    Funding Information: F.R. was supported by the Ad Astra Chandaria foundation. P.M. was funded by the Wellcome Trust (grant no. 210920/Z/18/Z). M.B. was supported by a grant from Templeton World Charity Foundation, Inc. (TWCF). The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of TWCF. Publisher Copyright: © 2020, Springer Nature Switzerland AG. This is a post-peer-review, pre-copyedit version of Rosas, F. E., Mediano, P. A. M., Biehl, M., Chandaria, S., & Polani, D. (2020). Causal blankets: Theory and algorithmic framework. In T. Verbelen, P. Lanillos, C. L. Buckley, & C. De Boom (Eds.), Active Inference - First International Workshop, IWAI 2020, Co-located with ECML/PKDD 2020, Proceedings (pp. 187-198). (Communications in Computer and Information Science; Vol. 1326). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-64919-7_19
    We introduce a novel framework to identify perception-action loops (PALOs) directly from data, based on the principles of computational mechanics. Our approach is based on the notion of a causal blanket, which captures sensory and active variables as dynamical sufficient statistics—i.e. as the "differences that make a difference." Furthermore, our theory provides a broadly applicable procedure to construct PALOs that requires neither steady-state nor Markovian dynamics. Using our theory, we show that every bipartite stochastic process has a causal blanket, but the extent to which this leads to an effective PALO formulation varies depending on the integrated information of the bipartition.
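    A toy sketch of the "differences that make a difference" idea underlying causal blankets: histories are grouped whenever they induce the same empirical next-state distribution, so the group label acts as a dynamical sufficient statistic. The paper's actual construction for perception-action loops is substantially richer; everything below is an illustration.

```python
from collections import defaultdict

def sufficient_statistic_labels(pairs):
    """Cluster histories by the empirical distribution of the state
    that follows them. `pairs` is a list of (history, next_state)
    samples with hashable entries. Histories inducing identical
    next-state distributions get the same label: the differences
    between them make no difference to the future."""
    counts = defaultdict(lambda: defaultdict(int))
    for h, s in pairs:
        counts[h][s] += 1
    labels, clusters = {}, {}
    for h, c in counts.items():
        total = sum(c.values())
        dist = tuple(sorted((s, n / total) for s, n in c.items()))
        labels[h] = clusters.setdefault(dist, len(clusters))
    return labels

# histories 'aa' and 'ba' behave identically, so they share a label
samples = [("aa", 0), ("aa", 1), ("ba", 0), ("ba", 1), ("bb", 1)]
print(sufficient_statistic_labels(samples))      # {'aa': 0, 'ba': 0, 'bb': 1}
```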